This session is about the basics of stochastic actor-oriented models (SAOMs). We refer to the RSiena page https://www.stats.ox.ac.uk/~snijders/siena/ for further resources. Here we will simulate from the model before estimating it.
It is appropriate that we start with simulation, as Siena stands for
Simulation Investigation for Empirical Network Analysis
We will use functionality from the network packages sna
and network (see https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/Markdowns/CHDH-SNA-2.Rmd).
The main package for SAOMs is RSiena.
The first time you use them, install the packages (uncomment the install commands)
# install.packages("sna")
# install.packages("network")
# install.packages("RSiena")
Once packages are installed, load them
library("sna")
library("network")
library('RSiena')
The RSiena package automatically loads van de Bunt's Freshman dataset (see Data description). We will use time-points 3 and 4
?tmp3# opens the helpfile with information about the dataset
class(tmp3)# both time-points are regular adjacency matrices
class(tmp4)
dim(tmp3)# 32 students by 32
n <- dim(tmp4)[1]
The original tie-variables had 6 levels but have here been dichotomised. It is good practice to check that ties are binary
table(tmp3,useNA = 'always')
table(tmp4,useNA = 'always')
RSiena handles missing data in estimation, but for the purposes of simulating and investigating the network, we replace missing values with zeros
tmp3[is.na(tmp3)] <- 0 # remove missing
tmp4[is.na(tmp4)] <- 0 # remove missing
Plotting the networks with nodes in the same places illustrates what the SAOM will try to model
par(mfrow = c(1,2))
coordin <- plot(as.network(tmp3), main='Time 3')
plot(as.network(tmp4),coord=coordin, main='Time 4')
Looking closely, we see that some arcs have been added and others have been removed
In this session, we will assume that we have one initial network \(X(t_0)=x_{obs}(t_0)\), at time \(t_0\) and that we want to say something about a network \(X(t_1)\), at time \(t_1\), \(t_0<t_1\).
We will only use two waves, but because of the Markov property of the model, all ingredients extend without modification to several waves.
To simulate (but also estimate) we need to execute, in turn, each of the functions

- sienaDependent - formats objects of class matrix to class sienaDependent
- sienaDataCreate - formats data to class siena
- getEffects - provides us with the effects we can use for the process
- sienaAlgorithmCreate - sets up simulation/estimation settings

After these steps, the model is simulated/estimated using siena07
The function sienaDependent formats and translates the
two adjacency matrices to a “sienaDependent”
object:
mynet1 <- sienaDependent(array(c(tmp3, tmp4), # the matrices in order
dim=c(32, 32,2))# are set as two slices in an array
)
For networks, sienaDependent takes networks glued
together in an \(n \times n \times\)
(number of waves) array.
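The array mechanics can be checked with a small toy example (plain base R, with hypothetical data, not the van de Bunt matrices): two matrices stacked as slices are recovered by indexing the third dimension.

```r
# Toy example: two 3-by-3 adjacency matrices stacked as waves
A <- matrix(0, 3, 3); A[1, 2] <- 1        # wave 1: a single tie 1 -> 2
B <- A; B[2, 1] <- 1                      # wave 2: the tie has been reciprocated
waves <- array(c(A, B), dim = c(3, 3, 2)) # n x n x (number of waves)
identical(waves[, , 1], A)                # TRUE: first slice is wave 1
identical(waves[, , 2], B)                # TRUE: second slice is wave 2
```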
You can define, as dependent variables, one-mode networks, two-mode networks, or dependent monadic variables
Once you have defined your variables, these are combined into a
siena object using sienaDataCreate
mydata <- sienaDataCreate(mynet1)
The siena object adds all relevant information about the data
mydata
## Dependent variables: mynet1
## Number of observations: 2
##
## Nodeset Actors
## Number of nodes 32
##
## Dependent variable mynet1
## Type oneMode
## Observations 2
## Nodeset Actors
## Densities 0.15 0.18
To determine what effects are available for our combination of different types of data, we create an effects object
myeff <- getEffects(mydata)
Assume a model where an actor \(i\), when they make a change, only cares about how many ties they have.
myeff <- includeEffects(myeff, recip,include=FALSE)# remove reciprocity, which is included by DEFAULT
## [1] effectName include fix test initialValue
## [6] parm
## <0 rows> (or 0-length row.names)
myeff$initialValue[myeff$shortName=='Rate'] <- 5.7288
myeff$initialValue[myeff$shortName=='density'][1] <- -0.7349
For later reference, notice how myeff is on the right-hand side of the assignment: includeEffects takes the effects object as input and returns a modified copy, which we assign back to myeff
What does the rate \(\lambda = 5.7288\) mean?
Each time the network has changed (or an opportunity to change has occurred), each actor waits \(T_i \overset{i.i.d.}{\thicksim} Exp(\lambda)\) time.
The actor that gets to change is the one who waits for the shortest amount of time
waiting <- rexp(32, 5.7288)
hist(waiting)
winner <- which( waiting == min(waiting))
paste('The winner is actor: ', winner)
## [1] "The winner is actor: 19"
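As a quick sanity check of the waiting-time mechanics (base R only, not part of the RSiena workflow): the minimum of \(n\) i.i.d. \(Exp(\lambda)\) waiting times is itself exponential with rate \(n\lambda\), so with \(n=32\) actors and \(\lambda=5.7288\), the expected time until the first change in the network is \(1/(32 \times 5.7288)\).

```r
set.seed(42)
n.actors <- 32
lambda <- 5.7288
# simulate many 'races': in each one, all 32 actors draw a waiting time
first.change <- replicate(10000, min(rexp(n.actors, lambda)))
mean(first.change)        # close to the theoretical value below
1 / (n.actors * lambda)   # expected time until the first change
```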
In the example of \(i=1\), deciding between creating a tie to \(j=2,4\) or breaking the tie to \(j=3\)
With our current model
\[ \Pr(X(1\leadsto 2))=\Pr(X(1\leadsto 4))=\frac{e^{\beta_1(1-2x_{12})}}{1+\sum_{j\neq 1} e^{\beta_1(1-2x_{1j})}}=\frac{\exp(-0.7349)}{\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)+1} \] and
\[ \Pr(X(1\leadsto 3))=\frac{e^{\beta_1(1-2x_{13})}}{1+\sum_{j\neq 1} e^{\beta_1(1-2x_{1j})}}=\frac{\exp(0.7349)}{\exp(-0.7349)+\exp(0.7349)+\exp(-0.7349)+1} \] The conditional probability of \(i=1\) creating the tie to \(j=2\) is thus
exp(-0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
## [1] 0.1185728
and the conditional probability of \(i=1\) deleting the tie to \(j=3\) is thus
exp(0.7349)/(exp(-0.7349)+exp(0.7349)+exp(-0.7349)+1)
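The four alternatives for \(i=1\) (toggle the tie to \(j=2,3,4\), or leave the network as it is) form a multinomial choice, so the probabilities must sum to one. A base-R check, using the same \(\beta_1=-0.7349\):

```r
beta1 <- -0.7349
x1 <- c(0, 1, 0)                  # current ties from actor 1 to j = 2, 3, 4
num <- exp(beta1 * (1 - 2 * x1))  # numerators for toggling each tie
denom <- 1 + sum(num)             # the '1' is the option of changing nothing
p <- c(num, 1) / denom            # toggle probabilities, plus 'no change'
sum(p)                            # 1
p[1]                              # P(create tie to j = 2), about 0.1186
```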
The function sienaAlgorithmCreate determines the
simulation/estimation settings. Here we are going to set
simOnly = TRUE and n3 = 100, to simulate 100 networks, starting in \(X(t_3)\)
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',# name will be appended to output
cond = FALSE, # NOT conditioning on num. changes
useStdInits = FALSE,# we are changing some defaults
nsub = 0 ,# more about subphases in estimation
n3 = 100,# number of simulations (in Phase 3)
simOnly = TRUE)# we only want to simulate
We are ready to simulate!
The function siena07 is the main engine of RSiena and is
used to estimate all models (apart from hierarchical SAOMs)
sim_ans <- siena07( sim_model,# the name of our model
data = mydata,# all of our data - see above for what is in there
effects = myeff,# the effects we have chosen, including parameters
returnDeps = TRUE,# save simulated networks
batch=TRUE )# batch=FALSE opens a GUI
The networks are in sim_ans$sims but to help with
extracting and formatting them we read in the function
reshapeRSienaDeps
source("https://raw.githubusercontent.com/johankoskinen/CHDH-SNA/main/data/reshapeRSienaDeps.R")
We apply it on sim_ans
mySimNets <- reshapeRSienaDeps(sim_ans,n)# n is the number of nodes (defined above)
The object mySimNets is a 100 by 32 by 32
array, holding the 100 simulated adjacency matrices
Plot networks with the same layout as for \(X(t_3)\) above
par(mfrow=c(2,5),# want 2 by 5 panels
oma = c(0,1,0,0) + 0.1,# custom outer margins
mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
coord=coordin) # observed network
invisible(# need to suppress printing to console
apply(mySimNets[1:9,,],# for the first 9 networks in the array
1,# along the first dimension, apply the following function
function(x)
plot(as.network(x),coord=coordin) ) )
It is hard to spot any differences. Plot density of the networks
hist( gden(mySimNets ) )
abline( v = gden(tmp4), col='red')
If actors only care about how many ties they have, we predict the density at \(t_4\) correctly
How about the number of reciprocated (mutual) dyads, i.e. pairs with \(x_{ij}x_{ji}=1\)?
hist( dyad.census(mySimNets)[,1] ,xlim=c(9,50))
abline(v = dyad.census(tmp4)[1], col='red')
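As a sanity check on what is being counted here (a base-R sketch with a hypothetical toy network, independent of the sna package): for a binary adjacency matrix with an empty diagonal, the number of mutual dyads is \(\sum_{i<j} x_{ij}x_{ji}\), which can be computed directly.

```r
count.mutual <- function(x) sum(x * t(x)) / 2  # x_ij * x_ji = 1 only for mutual pairs
# toy network: 1 <-> 2 is mutual, 1 -> 3 is asymmetric
x <- matrix(0, 3, 3)
x[1, 2] <- 1; x[2, 1] <- 1; x[1, 3] <- 1
count.mutual(x)  # 1
```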
Clearly, if the only rule actors have is to care about the number of ties they have, this is not enough to explain why there are so many (46) reciprocal dyads
Simulate assuming that actors want to have reciprocated ties to others, with the model with probabilities \(\Pr(X(i \leadsto j) \mid i \text{ makes decision})\): \[ p_{ij}(X,\beta)=\frac{e^{\beta_d (1-2x_{ij}) +\beta_r (1-2x_{ij})x_{ji} } }{ \sum_h e^{\beta_d (1-2x_{ih}) +\beta_r (1-2x_{ih})x_{hi} } } \]
with parameters \(\beta_d=-1.1046\) and \(\beta_r=1.2608\):
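A base-R sketch of this choice probability (an illustration with hypothetical toy data, not RSiena code): given actor \(i\)'s outgoing ties, the incoming ties, and \((\beta_d,\beta_r)\), compute the probability of each toggle, with the 'no change' option contributing \(\exp(0)=1\) to the denominator as in the earlier example.

```r
toggle.probs <- function(x.out, x.in, beta.d, beta.r) {
  # utility gain of toggling the tie to each alter
  u <- beta.d * (1 - 2 * x.out) + beta.r * (1 - 2 * x.out) * x.in
  p <- c(exp(u), 1)   # last entry: leave the network unchanged
  p / sum(p)
}
# i has no tie to j = 2 or 4, a tie to j = 3; only j = 3 has a tie back to i
p <- toggle.probs(x.out = c(0, 1, 0), x.in = c(0, 1, 0),
                  beta.d = -1.1046, beta.r = 1.2608)
round(p, 3)  # P(toggle to 2), P(toggle to 3), P(toggle to 4), P(no change)
sum(p)       # 1
```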
myeff <- includeEffects(myeff, recip,include=TRUE)# reinstate reciprocity, which is included by DEFAULT
## effectName include fix test initialValue parm
## 1 reciprocity TRUE FALSE FALSE 0 0
myeff$initialValue[myeff$shortName =='recip'][1] <- 1.2608
# We also need to change the other parameter values
myeff$initialValue[myeff$shortName=='Rate'] <- 6.3477
myeff$initialValue[myeff$shortName=='density'][1] <- -1.1046
Log odds for a new graph
|  | \(x_{ji}=0\) | \(x_{ji}=1\) |
|---|---|---|
| \(x_{ij}=0\) | 0 | \(\beta_d\) |
| \(x_{ij}=1\) | \(\beta_d\) | \(\beta_d+\beta_r\) |
Set up the algorithm
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 100,
simOnly = TRUE)
Run simulation
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Reshape and extract networks:
mySimNets <- reshapeRSienaDeps(sim_ans,n)
Plot, using the same code as before
par(mfrow=c(2,5),# want 2 by 5 panels
oma = c(0,1,0,0) + 0.1,# custom outer margins
mar = c(1,0,0,0) + 0.1)# custom inner margins
plot(as.network(tmp4),# observed network at t4
coord=coordin) # observed network
invisible(# need to suppress printing to console
apply(mySimNets[1:9,,],# for the first 9 networks in the array
1,# along the first dimension, apply the following function
function(x)
plot(as.network(x),coord=coordin) ) )
Compare the observed number of reciprocated dyads to the simulated number of reciprocated dyads
hist( dyad.census(mySimNets)[,1])
abline(v = dyad.census(tmp4)[1], col='red')
With an actor preference of 1.26 for having their ties reciprocated, we reproduce the observed reciprocity
What about the triad census? Check the first 9
triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks
Look at the distributions for transitive triads and dense triads
par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(8,40))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,24))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')
A preference for reciprocity is not enough to explain why reciprocated dyads hang together in triangles, or why there are so many transitive triads
Add another positive parameter \(\beta_t\) for the preference for closing open triads.
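The transitive triplets statistic behind this effect counts ordered triples \((i,j,h)\) with \(x_{ij}=x_{jh}=x_{ih}=1\). As a base-R sketch with a hypothetical toy network (independent of RSiena): the matrix \(X^2\) counts two-paths \(i \to j \to h\), and multiplying elementwise by \(X\) keeps only the two-paths closed by a direct tie \(i \to h\).

```r
count.trans.trip <- function(x) sum((x %*% x) * x)  # two-paths closed by a direct tie
# toy network: 1 -> 2 -> 3 plus the closing tie 1 -> 3: one transitive triplet
x <- matrix(0, 3, 3)
x[1, 2] <- 1; x[2, 3] <- 1; x[1, 3] <- 1
count.trans.trip(x)  # 1
```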
myeff <- includeEffects(myeff, recip,include=TRUE) # add the effect reciprocated ties
myeff <- includeEffects(myeff, transTrip,include=TRUE)
myeff$initialValue[myeff$shortName=='Rate'] <- 7.0959
myeff$initialValue[myeff$shortName=='density'][1] <- -1.6468
myeff$initialValue[myeff$shortName=='recip'][1] <- 0.8932
myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.2772
Set up the algorithm
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 100,
simOnly = TRUE)
Run simulation
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Reshape and extract networks:
mySimNets <- reshapeRSienaDeps(sim_ans,n)
What about the triad census? Check the first 9
triad.census(tmp4)# observed
triad.census(mySimNets[1:9,,])# first 9 simulated networks
Look at the distributions for transitive triads and dense triads
par(mfrow=c(1,2))
hist( triad.census(mySimNets)[,9], xlim=c(2,55))# column 9 is 030T
abline(v = triad.census(tmp4)[9], col='red')
hist( triad.census(mySimNets)[,16], xlim=c(4,86))# column 16 is 300
abline(v = triad.census(tmp4)[16], col='red')
Having preferences for reciprocated ties and transitive ties seems to explain ‘most’ of the structure
How did I pick the numbers?
Manually, you could adjust the parameters, simulate, and compare the simulated statistics with the observed ones, repeating until they match
Luckily, we have an algorithm (stochastic approximation) for automating this
We define the model in the same way as when we simulated data
myeff <- getEffects(mydata)# We already have our effects, but let's start from scratch
myeff <- includeEffects(myeff, recip,include=TRUE) # add the reciprocity effect (it is already included by DEFAULT though)
myeff <- includeEffects(myeff, transTrip,include=TRUE)# we want to estimate the transitivity effect
Set up the algorithm with default values
(siena07 will assume you want to estimate)
est_model <- sienaAlgorithmCreate(
projname = 'est_model',
n3 = 1000,# number of simulations in Phase 3 (default *is* 1000)
simOnly = FALSE,# default *is* FALSE
cond = FALSE,
useStdInits = FALSE)
Estimate!
est_ans <-siena07(est_model,# algorithm object
data = mydata, # the data object we created earlier
effects = myeff,# same effects as in simulation
returnDeps = TRUE,
batch=TRUE)
Estimation gives us an ANOVA-like table with

- Parameter (point) estimates
- Standard errors of estimates
- Convergence statistics
summary( est_ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
## 1. rate basic rate parameter mynet1 7.0602 ( 0.9044 ) -0.0443
## 2. eval outdegree (density) -1.6435 ( 0.1352 ) -0.0153
## 3. eval reciprocity 0.9010 ( 0.2058 ) 0.0222
## 4. eval transitive triplets 0.2754 ( 0.0450 ) 0.0162
##
## Overall maximum convergence ratio: 0.1014
##
##
## Total of 1796 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.818 -0.015 -0.003 0.000
## -0.119 0.018 -0.008 -0.004
## -0.015 -0.300 0.042 -0.002
## 0.002 -0.679 -0.171 0.002
##
## Derivative matrix of expected statistics X by parameters:
##
## 11.962 5.487 4.120 41.958
## 71.989 230.721 157.008 1302.602
## 18.128 75.919 87.283 490.981
## 190.061 631.165 500.261 4820.576
##
## Covariance matrix of X (correlations below diagonal):
##
## 127.331 120.448 83.640 841.117
## 0.560 363.124 271.906 2338.000
## 0.462 0.889 257.867 1930.051
## 0.555 0.913 0.895 18046.277
These estimates look very much like the numbers that I picked arbitrarily - what makes these better?
The observed statistics that we want to ‘hit’ on average are
est_ans$targets
## [1] 125 175 92 431
# the same as sim_ans$targets
The simulated statistics for the parameters from the estimation are
colMeans(est_ans$sf2)
## [,1] [,2] [,3] [,4]
## [1,] 124.5 174.708 92.356 433.172
# also provided in:
# est_ans$estMeans
The simulated statistics for the parameters from the simulation are
sim_ans$estMeans
## [1] 123.55094 174.01573 90.72833 419.69273
We can calculate within how many standard deviation units of the targets we are for each statistic \[ \frac{\bar{z}_k-z_{obs}}{sd(z_k)} \] The deviations using the estimates from estimation:
(colMeans(est_ans$sf2)-est_ans$targets)/sqrt(diag(var(est_ans$sf2[,1,])))
## [,1] [,2] [,3] [,4]
## [1,] -0.04431006 -0.01532341 0.0221693 0.01616836
# Also provided in:
# est_ans$tstat
For estimates from simulation:
(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))
For both sets of parameters, the simulated statistics are on average within less than 0.1 standard deviation units of the observed statistics
As a quick illustration, we can see what happens when we set rate, density, and reciprocity to the estimated values
myeff$initialValue[myeff$shortName=='Rate'] <- est_ans$theta[1]
myeff$initialValue[myeff$shortName=='density'][1] <- est_ans$theta[2]
myeff$initialValue[myeff$shortName=='recip'][1] <-est_ans$theta[3]
but pick another value for transitivity:
myeff$initialValue[myeff$shortName=='transTrip'][1] <- 0.1
and then, simulate!
sim_model <- sienaAlgorithmCreate(
projname = 'sim_model',
cond = FALSE,
useStdInits = FALSE,
nsub = 0 ,
n3 = 1000,
simOnly = TRUE)
sim_ans <- siena07( sim_model, data = mydata,
effects = myeff,
returnDeps = TRUE, batch=TRUE )
Calculate the scaled difference between the average statistics and the observed statistics again
(colMeans(sim_ans$sf2)-sim_ans$targets)/sqrt(diag(var(sim_ans$sf2[,1,])))
The average simulated statistics:
sim_ans$estMeans
are not close to the observed
sim_ans$targets
Decreasing the transitivity parameter results in networks having too few transitive triads
In general, we say that the estimation has converged, and that the estimates can be interpreted, if all convergence t-ratios are smaller than 0.1 in absolute value and the overall maximum convergence ratio is smaller than 0.25
The aim of the estimation algorithm is to find estimates \(\hat{\theta}\), such that \[
\underbrace{E_{\hat{\theta}}\{ Z[X(t_1)] \mid
X(t_0)=x_{obs}(t_0)\}}_{\text{'average'
statistics}}=Z[x_{obs}(t_1)]
\] The stochastic approximation algorithm in siena07 solves this equation computationally in three phases:

- Phase 1: a short simulation phase that estimates the derivatives of the expected statistics with respect to the parameters
- Phase 2: the estimation phase proper, where the parameters are iteratively updated in Robbins-Monro steps over a number of subphases (nsub)
- Phase 3: a final phase where, with the parameters held fixed, n3 networks are simulated to check convergence and compute standard errors
You may notice that the difference that is minimised in this algorithm is not \(\bar{z}-z_{obs}\). Only one network is simulated in each iteration - but it works (the other way also works but is less efficient)
The second column in the ANOVA table contains the nominal standard errors, i.e. the approximate standard deviations of the estimators of the \(\theta\)’s. Typically, these are used for standard hypothesis tests:
For effect/parameter \(k\), test \[ H_0:\beta_k=\beta_{k,0}=0 \] against \[ H_1:\beta_k\neq 0 \] The test statistic \[ \frac{\hat{\beta}_k-\beta_{k,0} }{sd(\hat{\beta}_k)}=\frac{\hat{\beta}_k }{sd(\hat{\beta}_k)}\approx \frac{\hat{\beta}_k }{se(\hat{\beta}_k)}, \] is approximately normally distributed \(N(0,1)\), if \(H_0\) is true.
Given the number of assumptions and approximations we need to make, I would advise that we stick to testing at the nominal 5% level, and reject \(H_0\) when the test statistic is greater than \(2\) in absolute value.
In the simple model we estimated, let us test \(H_0:\beta_t=0\). The observed test statistic
est_ans$theta[4]/est_ans$se[4]
## [1] 6.113833
is considerably larger than 2, and we can reject the null-hypothesis that the true value of the transitivity parameter is \(0\).
Intuitively, we could also test if we ‘need’ \(\beta_t\), by estimating the model with \(\beta_t=0\), and check the convergence t-statistic. You can do this yourself now!
There are many different types of covariates
| Type | RSiena type |
|---|---|
| Constant monadic covariates | coCovar |
| Changing monadic covariates | varCovar |
| Constant dyadic covariate | coDyadCovar |
| Changing dyadic covariate | varDyadCovar |
| Changing (covariate) network | sienaDependent |
The usual nomenclature for measurement scales, and the distinction between metric and non-metric variables, applies.
The adjacency matrices s501, s502, and
s503, for the so-called s-50 dataset, the network of 50
girls, are also available with RSiena. For a full
description of this dataset see ?s50 or http://www.stats.ox.ac.uk/~snijders/siena/s50_data.htm
For clarity, we rename the adjacency matrices
friend.data.w1 <- s501
friend.data.w2 <- s502
friend.data.w3 <- s503
Note that we have three (3) waves.
Among the many monadic covariates are ‘smoking’ and ‘drinking’:
drink <- s50a
smoke <- s50s
Here, each variable has its own \(n \times 3\) data matrix, with one observation for each individual at each time-point
head(smoke)
We are going to test \[ H_0: \text{smokers have as many friends as non-smokers} \] against an alternative hypothesis stating that there is a difference. We will interpret this as smokers “sending as many ties” and “receiving as many ties” as non-smokers.
We will use smoking at \(t_0\) as our explanatory variable and define ‘smoking’ as a value of 2 or 3.
smoke1.matrix <- as.numeric( smoke[,1]>1 )
table(smoke1.matrix, useNA='always')
To tell RSiena how to use this variable, we format it
smoke1 <- coCovar( smoke1.matrix )
Print to screen to see how R has interpreted the variable
smoke1
Format the dependent network variable as before:
friendshipData <- array( c( friend.data.w1,
friend.data.w2,
friend.data.w3 ),
dim = c( 50, 50, 3 ) )
friendship <- sienaDependent(friendshipData)
Using sienaDataCreate, we give RSiena the
opportunity to figure out how the data are structured and what types of
effects can be calculated from them
s50data <- sienaDataCreate( friendship, smoke1)
Since we have both a network and a monadic covariate, we will have more effects available to us than previously
s50.effects <- getEffects(s50data)
For testing our hypothesis we want a statistic that corresponds to \[ v_i x_{ij} \] for each tie-variable, and where \(v_i\) is one or zero according to whether \(i\) is a smoker or not, respectively. This is the sender effect
s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
egoX,# the shortname evokes that the variable for 'ego' affects the decision
interaction1 = "smoke1" # the variable we want to interact the DV with
)
Next, we want a statistic that corresponds to \[ v_j x_{ij} \] so that if the corresponding parameter is positive, then actors are more likely to form (maintain) ties to actors \(j\) that have \(v_j=1\), i.e. are smokers
s50.effects <- includeEffects( s50.effects,
altX,# the shortname evokes that the variable for 'alter' affects ego's decision
interaction1 = "smoke1" # the variable we want to interact the DV with
)
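The statistics behind these two effects can be computed directly from an adjacency matrix and a covariate vector (a base-R sketch with hypothetical toy data, not the s50 variables): the ego statistic is \(\sum_{i,j} v_i x_{ij}\) (out-ties weighted by the sender's covariate) and the alter statistic is \(\sum_{i,j} v_j x_{ij}\) (ties weighted by the receiver's covariate).

```r
ego.stat   <- function(x, v) sum(v * rowSums(x))  # sum_ij v_i x_ij
alter.stat <- function(x, v) sum(v * colSums(x))  # sum_ij v_j x_ij
# toy data: actor 1 is a 'smoker' (v = 1) and sends two ties; one tie points at actor 1
x <- matrix(0, 3, 3)
x[1, 2] <- 1; x[1, 3] <- 1; x[2, 1] <- 1
v <- c(1, 0, 0)
ego.stat(x, v)    # 2: both of actor 1's out-ties
alter.stat(x, v)  # 1: the tie 2 -> 1 points at a smoker
```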
For the van de Bunt dataset, we saw that triadic closure was important. We can choose to include it because we believe that it is important, but also as a control for our hypothesised effects
s50.effects <- includeEffects(s50.effects,# We "add an effect" to s50.effects
transTrip,# shortname of the effect
include=TRUE)
Our specified model is
s50.effects
## effectName include fix test initialValue parm
## 1 constant friendship rate (period 1) TRUE FALSE FALSE 4.69604 0
## 2 constant friendship rate (period 2) TRUE FALSE FALSE 4.32885 0
## 3 outdegree (density) TRUE FALSE FALSE -1.46770 0
## 4 reciprocity TRUE FALSE FALSE 0.00000 0
## 5 transitive triplets TRUE FALSE FALSE 0.00000 0
## 6 smoke1 alter TRUE FALSE FALSE 0.00000 0
## 7 smoke1 ego TRUE FALSE FALSE 0.00000 0
Specify the simulation settings
s50algorithm.simple <- sienaAlgorithmCreate( projname = 's50_simple' )# We are only using defaults
Estimating the model with default settings is simply a call to siena07
s50.simple.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE )
Print the results to screen
summary( s50.simple.ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
##
## Rate parameters:
## 0.1 Rate parameter period 1 6.4444 ( 1.0722 )
## 0.2 Rate parameter period 2 5.2198 ( 0.8698 )
##
## Other parameters:
## 1. eval outdegree (density) -2.6794 ( 0.1181 ) -0.0411
## 2. eval reciprocity 2.4578 ( 0.2102 ) -0.0353
## 3. eval transitive triplets 0.6177 ( 0.0755 ) -0.0310
## 4. eval smoke1 alter 0.0055 ( 0.1824 ) 0.0398
## 5. eval smoke1 ego 0.0471 ( 0.1974 ) -0.0509
##
## Overall maximum convergence ratio: 0.1362
##
##
## Total of 2203 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.014 -0.016 -0.004 -0.003 0.001
## -0.635 0.044 -0.003 0.003 0.003
## -0.410 -0.164 0.006 0.000 -0.002
## -0.152 0.083 -0.019 0.033 -0.012
## 0.031 0.084 -0.118 -0.325 0.039
##
## Derivative matrix of expected statistics X by parameters:
##
## 316.194 251.881 545.392 7.338 4.700
## 141.860 148.192 286.710 0.569 0.031
## 366.271 331.308 1191.570 11.931 12.269
## 14.844 10.751 26.358 43.201 22.967
## 0.915 -1.618 2.579 24.688 38.514
##
## Covariance matrix of X (correlations below diagonal):
##
## 506.789 439.782 1203.778 2.808 3.903
## 0.936 435.173 1159.266 3.008 2.734
## 0.807 0.839 4386.436 1.996 0.741
## 0.016 0.019 0.004 58.793 42.313
## 0.024 0.018 0.002 0.757 53.139
Are all convergence statistics less than 0.1 in absolute value? If so, we can test our hypothesis - do we reject our \(H_0\)?
To account for (test) possible assortativity on smoking, we add the homophily effect:
s50.effects <- includeEffects( s50.effects,# we "add an effect" to our effects object
egoXaltX,# the shortname
interaction1 = "smoke1" # the variable we want to interact the DV with
)
s50.hom.ans <- siena07( s50algorithm.simple,# estimation settings
data = s50data,# data object that knows DV and IV
effects = s50.effects,# the effects we specified
batch = TRUE,
returnDeps = TRUE )
Print the results to screen
summary( s50.hom.ans )
## Estimates, standard errors and convergence t-ratios
##
## Estimate Standard Convergence
## Error t-ratio
##
## Rate parameters:
## 0.1 Rate parameter period 1 6.5620 ( 1.1765 )
## 0.2 Rate parameter period 2 5.2829 ( 0.8868 )
##
## Other parameters:
## 1. eval outdegree (density) -2.6963 ( 0.1103 ) -0.0655
## 2. eval reciprocity 2.4384 ( 0.1884 ) -0.0420
## 3. eval transitive triplets 0.6120 ( 0.0720 ) -0.0406
## 4. eval smoke1 alter -0.0870 ( 0.2002 ) -0.0799
## 5. eval smoke1 ego -0.0078 ( 0.1958 ) -0.0721
## 6. eval smoke1 ego x smoke1 alter 0.6426 ( 0.3611 ) -0.0773
##
## Overall maximum convergence ratio: 0.1190
##
##
## Total of 2149 iteration steps.
##
## Covariance matrix of estimates (correlations below diagonal)
##
## 0.012 -0.012 -0.003 0.000 0.001 -0.006
## -0.594 0.035 -0.003 0.003 0.002 -0.002
## -0.356 -0.185 0.005 -0.001 -0.002 -0.001
## -0.016 0.077 -0.044 0.040 -0.012 -0.018
## 0.027 0.049 -0.117 -0.310 0.038 -0.015
## -0.142 -0.024 -0.028 -0.251 -0.206 0.130
##
## Derivative matrix of expected statistics X by parameters:
##
## 292.885 231.918 484.151 5.010 7.566 20.361
## 130.474 141.767 257.600 0.879 2.436 9.478
## 339.930 315.561 1150.329 14.733 20.300 29.838
## 8.007 6.328 28.606 43.381 27.393 10.599
## 10.764 10.420 45.175 30.361 44.497 11.262
## 23.603 20.690 49.639 10.349 10.196 13.801
##
## Covariance matrix of X (correlations below diagonal):
##
## 478.754 412.769 1130.343 6.258 12.087 41.163
## 0.934 408.305 1087.786 8.510 14.420 38.583
## 0.805 0.839 4121.622 31.279 51.322 108.502
## 0.035 0.052 0.060 65.309 53.015 17.954
## 0.069 0.089 0.100 0.820 64.074 18.485
## 0.420 0.426 0.377 0.496 0.516 20.050
If the estimation has converged, we can test our hypothesis, controlling for possible assortativity/homophily
What about homophily on smoking - do smokers befriend other smokers?